Abstract.

The size and power of Student's t-test are discussed under
weaker than normal conditions. It is shown that assuming only
a symmetry condition for the null hypothesis leads to effective
bounds on the dispersion of the t-statistic. (The symmetry
condition is weak enough to include all cases of independent but
not necessarily identically distributed observations, each
symmetric about the origin.) The connection between Student's
test and the usual non-parametric tests is examined, as well as
power considerations involving Winsorization and permutation
tests. Simultaneous use of different one-sample tests is also
discussed.
1. The Geometry of Student's One-sample t-statistic.

The distribution of Student's one-sample t-statistic

$$T_n = \frac{\sqrt{n}\,\bar{X}}{\sqrt{\sum_{i=1}^{n} (X_i - \bar{X})^2 / (n-1)}}$$

is usually derived under normal sampling theory, with the $X_i$
assumed to be independent, identically distributed normal random
variables,

$$X_i \sim N(\mu, \sigma^2), \qquad i = 1, 2, \ldots, n.$$

In his essentially geometrical derivation of Student's distribution,
Fisher [6] showed that the rotational symmetry of the random
vector

$$X = (X_1, X_2, X_3, \ldots, X_n)$$

under the null hypothesis $\mu = 0$ is sufficient to yield the
standard null distribution for $T_n$.

To be more precise, let U be the unit vector $U = X/\|X\|$,
so that

$$U_i = X_i \Big/ \sqrt{\sum_{j=1}^{n} X_j^2}, \qquad i = 1, 2, \ldots, n.$$

Under the null hypothesis $X_i \sim N(0, \sigma^2)$, U will be uniformly
distributed on the surface of $S_n$, the unit sphere in Euclidean
n-space $E^n$,

$$S_n = \{u : \sum_{i=1}^{n} u_i^2 = 1\}.$$

For any set A on $S_n$, $P(U \in A) = \lambda\{A\}$, where $\lambda$ is the usual measure
of n-1 dimensional "area" on $S_n$, normalized so that $\lambda\{S_n\} = 1$.




[This follows from the fact that the density of X in $E^n$ depends
only on $\|X\|$.]

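Student's statistic is a monotonic function of

$$S_n = \sum_{i=1}^{n} X_i \Big/ \sqrt{\sum_{i=1}^{n} X_i^2} = \sum_{i=1}^{n} U_i$$

($T_n = S_n \sqrt{(n-1)/(n - S_n^2)}$). If we let

$$e = \left(\frac{1}{\sqrt{n}}, \frac{1}{\sqrt{n}}, \ldots, \frac{1}{\sqrt{n}}\right)$$

represent the unit main diagonal, then

$$S_n = \sqrt{n} \cos \theta_n,$$

where $\theta_n$ is the angle between U and e (see Figure 1), so that
$S_n$ is a decreasing function of $\theta_n$.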


Figure  1 




The distribution for $S_n$ now follows from the known formula
for the area of a spherical cap on $S_n$. In particular, if we wish
to choose $s_{n,\alpha}$ such that

$$P(S_n \geq s_{n,\alpha}) = \alpha,$$

it is equivalent to find the angular radius $\theta_{n,\alpha}$ of a spherical
cap $C_{n,\alpha}$ on $S_n$ having

$$\lambda\{C_{n,\alpha}\} = \alpha.$$

The rejection set for Student's one-sided t-test is then $U \in C_{n,\alpha}$,
where $C_{n,\alpha}$ has radius $\theta_{n,\alpha}$ and center e. The value $s_{n,\alpha}$ is
given by $s_{n,\alpha} = \sqrt{n} \cos \theta_{n,\alpha}$.

For reasonable values of n and $\alpha$, the critical angle $\theta_{n,\alpha}$
tends to be quite large. If $\alpha = .025$ for instance, we have the
following table of values:

    n =            6      11     26     51     ∞
    θ_{n,α} =      41°    55°    68°    74°    90°
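The critical angles can be checked numerically from the cap-area description alone. The sketch below assumes the standard fact that the angle between a uniformly distributed unit vector in n-space and a fixed axis has density proportional to sin^(n-2); the function names and the use of Simpson's rule with bisection are illustrative choices, not part of the paper.

```python
import math

def sin_power_integral(n, upper, steps=2000):
    """Simpson's rule for the integral of sin(phi)^(n-2) over [0, upper]."""
    h = upper / steps
    total = math.sin(0.0) ** (n - 2) + math.sin(upper) ** (n - 2)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * math.sin(k * h) ** (n - 2)
    return total * h / 3

def cap_fraction(n, theta):
    """Normalized area of a spherical cap of angular radius theta on the unit sphere in n-space."""
    return sin_power_integral(n, theta) / sin_power_integral(n, math.pi)

def critical_angle_degrees(n, alpha):
    """Angular radius (degrees) of the cap with normalized area alpha, found by bisection."""
    whole = sin_power_integral(n, math.pi)
    lo, hi = 0.0, math.pi / 2          # valid for alpha < 1/2
    for _ in range(50):
        mid = (lo + hi) / 2
        if sin_power_integral(n, mid) / whole < alpha:
            lo = mid
        else:
            hi = mid
    return math.degrees((lo + hi) / 2)

for n in (6, 11, 26, 51):
    theta = critical_angle_degrees(n, 0.025)
    s_crit = math.sqrt(n) * math.cos(math.radians(theta))
    print(n, round(theta, 1), round(s_crit, 3))
```

The printed angles should agree with the tabled values to within about a degree, and the last column gives the corresponding critical value of the statistic on the S scale.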

Figure 1 is misleading since it shows $C_{n,\alpha}$ entirely contained in
the positive orthant of $S_n$. As a crucial part of our discussion
we will see that ordinarily $C_{n,\alpha}$ will extend far outside the
positive orthant. For example, when n = 20 and $\alpha = .05$, $C_{n,\alpha}$
contains the center points of 60,460 of the $2^{20}$ orthants [c.f.
Section 3].
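The orthant count quoted for n = 20 can be reproduced directly. In this sketch the tabled upper 5% point of Student's t with 19 degrees of freedom (1.729) is taken as an assumed input, converted to the S scale through the algebraic relation between T and S, and the orthant centers (sign vectors divided by the square root of n) lying in the rejection cap are counted with binomial coefficients. The helper names are hypothetical.

```python
from math import comb, sqrt

def s_from_t(t, n):
    """Convert a value of Student's T_n to the S_n scale: S = T*sqrt(n)/sqrt(n-1+T^2)."""
    return t * sqrt(n) / sqrt(n - 1 + t * t)

def orthant_centers_in_cap(n, s_crit):
    """Count sign vectors delta with sum(delta_i)/sqrt(n) >= s_crit."""
    count = 0
    for k in range(n + 1):             # k = number of +1 coordinates
        if (2 * k - n) / sqrt(n) >= s_crit:
            count += comb(n, k)
    return count

n = 20
t_crit = 1.729                         # assumed: tabled upper 5% point, 19 d.f.
s_crit = s_from_t(t_crit, n)
count = orthant_centers_in_cap(n, s_crit)
print(round(s_crit, 3), count, round(count / 2 ** n, 4))
```

The ratio of the count to 2^20 is the exact size of the nominal 5% test when every observation is +1 or -1 with probability 1/2.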
2. A Summary of This Paper.

The geometry of Section 1 shows that the normal theory for
the null hypothesis distribution of Student's t-statistic remains
valid under the weaker assumption of rotational symmetry, i.e.
that $U = X/\|X\|$ is uniformly distributed over $S_n$. Thus, for
example, the null hypothesis that X is uniformly distributed
within some sphere centered at the origin can be tested at level
$\alpha$ by rejecting for values of $T_n$ greater than the tabled upper $\alpha$
point of Student's distribution (with n-1 degrees of freedom, in
the standard terminology).

Unfortunately the usual sampling procedures almost never*
yield rotational symmetry for the normalized vector U except in
the case $X_i$ i.i.d. $N(0, \sigma^2)$. If, for instance, the $X_i$ are
independently +1 or -1 with probabilities 1/2, then U is always the
center point of one of the orthants of $S_n$.

The central purpose of this paper is to discuss Student's
t-statistic under a much weaker symmetry condition, which is
satisfied under the null hypotheses of many standard sampling
situations:

Definition: The random vector $U = (U_1, U_2, \ldots, U_n)$ is said to have
ORTHANT SYMMETRY if it has the same distribution as
$U_\delta = (\delta_1 U_1, \delta_2 U_2, \ldots, \delta_n U_n)$ for every choice of
$\delta_i = \pm 1$, $i = 1, 2, \ldots, n$.

*A very special "lucky" case is given in Section 5. It is possible
to construct examples where $T_n$ has the t distribution without U
having rotational symmetry.


In particular, orthant symmetry obtains for $U = X/\|X\|$ whenever
the components $X_i$ of X are independent and each has a
symmetric distribution about the origin. It is not necessary
that the components have identical distributions.

Our main results are presented in Section 3, and can be
roughly paraphrased as follows: orthant symmetry guarantees that
Student's statistic, in the form
$S_n = \sum_{i=1}^{n} X_i / \sqrt{\sum_{i=1}^{n} X_i^2}$, is less
dispersed about the origin than the random variable
$\frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Delta_i$,
where the $\Delta_i$ are independent and equal +1 or -1 with probabilities
1/2. That is, among all cases of orthant symmetry, the centered
binomial case is the worst, in a sense to be described. We
suggest that the size of Student's one-sample t-test is robust
under the null hypothesis of orthant symmetry, and as a matter of
fact, the type I error tends to decrease from the nominal $\alpha$ level
under such "bad" conditions as the $X_i$ having Cauchy distributions.

Sections 4 and 5 contain heuristic discussions of this point,
as well as an Edgeworth-type expansion to help assess the magnitude
of the decrease. Section 5 is particularly concerned with
the effects of long-tailed error laws, such as the Cauchy, on
Student's ratio. As an aid to intuition, a particularly tractable
long-tailed error law is introduced and examined in detail.

Orthant  symmetry  is  preserved  under  many  familiar  statistical 
operations:  taking  signs,  ranks,  censoring,  Winsorizing,  etc. 

In  Section  6  we  use  this  fact  to  discuss  the  sign  test  and 
Wilcoxon's  signed  rank  test  as  "generalized  Student's  tests".* 

*Our  definition  of  this  term  is  not  that  of  Hajek  [8]. 
Section 7 extends this concept to Winsorization and permutation
tests, and shows how orthant symmetry allows some "cheating" for
increased power, that is looking at the data before choosing the
test statistic, without compromising the $\alpha$ level.

The use of more than one test on the same data, for instance
Student's test and the sign test, is discussed in Section 8, and
a method of evaluating the $\alpha$ level of the simultaneous testing
procedure is suggested. The question of conditional versus
unconditional tests, which is largely ignored in most of the
paper, is discussed in Section 9 in relation to another geometrical
distribution, that of the angle between X and a vector other
than e. We conclude with a discussion of references. Mathematical
details are collected in an appendix to the paper.

3.  The  Main  Results. 

We will work with Student's statistic in the form*

$$S_n = \sum_{i=1}^{n} X_i \Big/ \sqrt{\sum_{i=1}^{n} X_i^2}.$$

Our main assumption will be that $X = (X_1, X_2, \ldots, X_n)$
has orthant symmetry as defined in Section 2, and to avoid
trivialities we also assume that $P(X_i = 0) = 0$ for $i = 1, 2, \ldots, n$.

Let $\Lambda_n$ be the probability distribution of $U = X/\|X\|$ on the
unit sphere $S_n$ (so that $\Lambda_n = \lambda_n$, the uniform distribution, if
$X_i$ i.i.d. $N(0, \sigma^2)$). Orthant symmetry is equivalent to the statement
that $\Lambda_n$ is identical over each of the $2^n$ orthants of $S_n$. In
particular we can consider only the positive orthant $S_n^+$,

$$S_n^+ = \{\xi = (\xi_1, \xi_2, \ldots, \xi_n) : \xi_i \geq 0, \ \sum_{i=1}^{n} \xi_i^2 = 1\},$$

and define the probability measure $\Lambda_n^+$ on $S_n^+$ by

$$\Lambda_n^+\{A\} = 2^n \Lambda_n\{A\}$$

for $A \subset S_n^+$. We see that $\Lambda_n^+$ determines $\Lambda_n$ on all of $S_n$ via the
orthant symmetry. In the case $\Lambda_n = \lambda_n$ we have $\Lambda_n^+ = \lambda_n^+$, the
uniform distribution on $S_n^+$.

*Use of $S_n$ rather than the traditional $T_n$ almost obviates the
need for special tables in the standard case $X_i$ i.i.d. $N(0, \sigma^2)$.
The upper 5% point of $S_6$ for instance is 1.640, as compared to
1.645 for a N(0,1) variable (c.f. Section 4).

Definition: Let $\xi \in S_n^+$. Then $S_\xi = \sum_{i=1}^{n} \Delta_i \xi_i$, where the $\Delta_i$ are
independent random variables taking values +1 or -1 with
probabilities 1/2, will be called a generalized binomial random variable.
The case $S_e = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Delta_i$ will be called the centered binomial
random variable.

The  following  simple  lemma  is  basic  to  our  results. 

Lemma: Under orthant symmetry $S_n$ is a mixture of generalized
binomial random variables with $\Lambda_n^+$ as the mixing distribution.
That is,

$$P(S_n < s) = \int_{S_n^+} P(S_\xi < s) \, d\Lambda_n^+\{\xi\}$$

for every value of s.

The lemma is only a statement of the fact that if we condition
on the vector of normalized absolute values $\xi = (\xi_1, \xi_2, \ldots, \xi_n)$
defined by

$$\xi = (|U_1|, |U_2|, \ldots, |U_n|),$$

then orthant symmetry guarantees that each of the $2^n$ possible
unit vectors

$$U = (\delta_1 \xi_1, \delta_2 \xi_2, \ldots, \delta_n \xi_n), \qquad \delta_i = \pm 1, \quad i = 1, 2, \ldots, n,$$

is equally likely.
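The conditional distribution described by the lemma can be enumerated exactly for small n. The following is a minimal sketch, with hypothetical data and function names, that lists the 2^n equally likely values of the generalized binomial for a fixed vector of normalized absolute values.

```python
from itertools import product
from math import sqrt

def generalized_binomial(xi):
    """All 2^n equally likely values of sum(delta_i * xi_i), delta_i = +1 or -1."""
    return [sum(d * x for d, x in zip(signs, xi))
            for signs in product((-1, 1), repeat=len(xi))]

def tail_prob(xi, s):
    """Exact P(S_xi >= s); the small tolerance guards against rounding at ties."""
    values = generalized_binomial(xi)
    return sum(v >= s - 1e-12 for v in values) / len(values)

x = [0.3, -1.2, 2.5, 0.7, -0.4]        # hypothetical observations
norm = sqrt(sum(v * v for v in x))
xi = [abs(v) / norm for v in x]        # normalized absolute values

values = generalized_binomial(xi)
mean = sum(values) / len(values)
variance = sum(v * v for v in values) / len(values)
print(round(mean, 12), round(variance, 12), tail_prob(xi, 1.5))
```

Whatever sample is supplied, the enumerated distribution has mean 0 and variance 1, because the components of the normalized vector have squares summing to one.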

Notice that the lemma expresses the distribution of $S_n$ in
terms of an integrand $P(S_\xi < s)$ that does not depend on the
distribution of the observations X. (This distribution enters
only through the induced measure $\Lambda_n^+$.) Therefore, anything we
can prove about the class of generalized binomial random variables
yields a general theorem about Student's statistic under orthant
symmetry. A simple but not very useful example is Tchebycheff's
inequality: $P(|S_\xi| \geq c) \leq 1/c^2$ for every $\xi$, since $S_\xi$ has mean 0
and variance 1. Therefore $P(|S_n| \geq c) \leq 1/c^2$ under orthant symmetry.

Moments are convenient to work with here, since they pass
easily through the mixture process. For every value of $\xi$,
$E S_\xi^2 = 1$, $E S_\xi^\nu = 0$ for $\nu$ odd, so that the same statements hold
for $S_n$. (In particular, $S_n$ has mean 0 and variance 1.) Our main
result is a bound on the higher even moments of $S_n$.

Theorem: Under orthant symmetry, $E S_n^\nu \leq E S_e^\nu$ for $\nu = 4, 6, 8, \ldots$,
with equality if and only if the $X_i$ are identical independent
centered Bernoulli trials, $X_i = \pm c$ with probabilities 1/2 for
$i = 1, 2, \ldots, n$.

Recall that $S_e = \frac{1}{\sqrt{n}} \sum_{i=1}^{n} \Delta_i$, where the $\Delta_i$ independently equal
+1 or -1 with probabilities 1/2. The theorem says that the $\nu$-th
central moment of $S_n$ is bounded by the corresponding moment of
the centered binomial random variable, which equals $(2/\sqrt{n})^\nu$ times the
$\nu$-th central moment of a standard binomial random variable with n trials and
p = 1/2.

The theorem follows from the lemma by showing that $E S_\xi^\nu < E S_e^\nu$
for $\xi \neq e$. A proof of this statement is given in the
appendix to this paper.

Corollary: $E S_n^\nu$ is less than the corresponding moment of a N(0,1)
random variable.

By the theorem, it is necessary and sufficient to prove this
for $E S_e^\nu$, which is done in the appendix.

Our theorem bounds the moments of $S_n$ rather than the type I
error probabilities $P(S_n \geq s_{n,\alpha})$. In the next section we will
use the mixture lemma to develop an Edgeworth expansion for the
distribution of $S_n$. The random variable $S_n$ has mean 0 and
variance 1, and differs from a N(0,1) distribution by an Edgeworth
sum whose leading term depends on the kurtosis* of $S_n$,

$$\mathrm{kurt}(S_n) = E S_n^4 - 3.$$
n  n 

The  next  corollary  provides  some  justification  for  the  statement 
in  the  summary  that  Student's  test  tends  to  behave  conservatively 
(smaller  than  nominal  a  level)  under  orthant  symmetry. 

Corollary: $S_n$ has negative kurtosis under orthant symmetry. Under
the additional assumption that the variables $U_i$ are exchangeable
(index symmetry),

$$\mathrm{kurt}(S_n) = -2n E U_1^4.$$

We calculate directly

$$E S_\xi^4 = 3 - 2 \sum_{i=1}^{n} \xi_i^4,$$

and therefore $E S_n^4 = 3 - 2E \sum_{i=1}^{n} \xi_i^4 = 3 - 2n E U_1^4$ under exchangeability.
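The fourth-moment identity and the ordering of even moments can be checked by brute force for small n. This sketch enumerates all sign patterns; the unequal weight vector is a hypothetical example, and the comparison values 3 and 15 are the fourth and sixth moments of a N(0,1) variable.

```python
from itertools import product
from math import sqrt

def unit(v):
    """Scale a vector of positive weights to unit Euclidean length."""
    norm = sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def moment(xi, nu):
    """Exact nu-th moment of sum(delta_i * xi_i) over all 2^n equally likely sign patterns."""
    n = len(xi)
    total = sum(sum(d * x for d, x in zip(signs, xi)) ** nu
                for signs in product((-1, 1), repeat=n))
    return total / 2 ** n

n = 6
e = unit([1.0] * n)                    # main diagonal: the centered binomial case
xi = unit([3.0, 1.0, 1.0, 1.0, 1.0, 1.0])   # hypothetical unequal case

fourth_xi = moment(xi, 4)
fourth_e = moment(e, 4)
print(round(fourth_xi, 4), round(3 - 2 * sum(c ** 4 for c in xi), 4))
print(round(fourth_e, 4), round(moment(xi, 6), 4), round(moment(e, 6), 4))
```

In this example the unequal vector gives smaller fourth and sixth moments than the diagonal vector, and both stay below the normal values, in line with the theorem and its corollaries.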

*Many  writers  call  this  the  "coefficient  of  excess"  rather  than 
the  kurtosis. 
Note that U will have both symmetries if the random variables
$X_i$ are independent and identically distributed symmetrically about
the origin.

Negative kurtosis tends to give a distribution that has
smaller than the N(0,1) probability of exceeding any constant s
greater than $\sqrt{3}$. The tabled values of $s_{n,\alpha}$, derived from normal
theory, are close to the $\alpha$ points for a N(0,1) distribution.

These two facts together would indicate that $P(S_n \geq s_{n,\alpha}) < \alpha$
under orthant symmetry for the usual values of $\alpha$. This statement
is not actually true in general, but the violations of the $\alpha$-level
seem to be slight, particularly in the case of i.i.d. observations.
We will examine this phenomenon more closely in the next two sections.

R. R. Bahadur and J. Eaton have communicated the following
interesting bound on $P(S_n \geq s)$, and have been kind enough to
allow me to include it in this paper:

Theorem (Bahadur and Eaton): Under orthant symmetry,
$P(S_n \geq s) \leq e^{-\frac{1}{2} s^2}$.

The proof follows from the mixture lemma and the fact that
$P(S_\xi \geq s) \leq E e^{s(S_\xi - s)} \leq e^{-\frac{1}{2} s^2}$.
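The two inequalities in the proof can be spelled out as follows. This is a sketch of the standard exponential-moment argument, which is presumably what is intended; it assumes $s \geq 0$ and uses only the independence of the signs $\Delta_i$, the elementary bound $\cosh x \leq e^{x^2/2}$, and $\sum_i \xi_i^2 = 1$.

```latex
\begin{align*}
P(S_\xi \geq s)
  &\leq E\, e^{s(S_\xi - s)}
     && \text{(Markov's inequality applied to } e^{s S_\xi}\text{)} \\
  &= e^{-s^2} \prod_{i=1}^{n} E\, e^{s \Delta_i \xi_i}
   = e^{-s^2} \prod_{i=1}^{n} \cosh(s \xi_i) \\
  &\leq e^{-s^2} \prod_{i=1}^{n} e^{\frac{1}{2} s^2 \xi_i^2}
   = e^{-s^2}\, e^{\frac{1}{2} s^2 \sum_i \xi_i^2}
   = e^{-\frac{1}{2} s^2}.
\end{align*}
```

Since the bound does not depend on $\xi$, integrating it against the mixing distribution $\Lambda_n^+$ gives the same bound for $P(S_n \geq s)$.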

4. Edgeworth Expansion For Student's Statistic.

We can obtain an expansion for the c.d.f. of $S_n$ from the
mixture lemma in the following way: we expand the generalized
binomial c.d.f. in a standard Edgeworth series ([2], Chapter 17),

$$P(S_\xi < s) = \Phi(s) + k_4(\xi) \Phi^{(4)}(s) + k_6(\xi) \Phi^{(6)}(s) + \ldots,$$

where $\Phi$ is the standard N(0,1) c.d.f., $\Phi^{(4)}(s)$ its fourth
derivative, etc. The constants $k_j$ depend on $\xi$, and hence on n,
and vanish for odd values of j by symmetry. The second corollary
of Section 3 guarantees that $k_4(\xi) = \frac{1}{4!} \mathrm{kurt}(S_\xi)$ is negative.

The mixture lemma now yields

$$P(S_n < s) = \Phi(s) + E k_4(\xi) \Phi^{(4)}(s) + E k_6(\xi) \Phi^{(6)}(s) + \ldots,$$

where the expectations are with respect to $\Lambda_n^+$,

$$E k_j(\xi) = \int_{S_n^+} k_j(\xi) \, d\Lambda_n^+\{\xi\}$$

(recall that $\xi_i = |U_i| = |X_i| / \|X\|$ for $i = 1, 2, \ldots, n$). These
expectations become particularly simple when U has index symmetry
(exchangeable coordinates) as well as orthant symmetry.

Edgeworth Expansion: assuming that $U = X/\|X\|$ has orthant and
index symmetry,

$$P(S_n < s) = \Phi(s) - \left\{ \tfrac{1}{12} (E\, n U_1^4) \Phi^{(4)}(s) \right\}
 + \left\{ \tfrac{1}{45} (E\, n U_1^6) \Phi^{(6)}(s) + \tfrac{1}{288} \left( E\, n U_1^8 + E\, n(n-1) U_1^4 U_2^4 \right) \Phi^{(8)}(s) \right\} + \ldots$$

(See the appendix for the derivation of this formula.) Here the
first bracketed term comes from the "1/n" term in the Edgeworth
series for $S_\xi$, while the second bracketed term is "$1/n^2$". The
quotation marks are necessary since, as we shall see in Section 5,
these terms will approach non-zero limits if the $X_i$ have long tails.

The usefulness of an asymptotic (non-convergent) series, such
as the expansion given above, can usually be determined only by
experience. Cramér suggests in [2] that the Edgeworth expansion
not be used beyond the first correction term. Since in our case
this term is either negligible or positive for values of s in the
usual testing range, say $1.5 \leq s \leq 2.5$, the simple approximation
of $S_n$'s c.d.f. by $\Phi(s)$ would seem to err usually in the
conservative direction.


A simple test case for the expansion formula is that where
the $X_i$ are independent N(0,1), i.e. in the case of the genuine
Student's distribution.

n = 5 (4 degrees of freedom)

    s                                       .50    .75    1.00   1.50   2.00
    Actual value P(S_n < s)                 .665   .743   .813   .928   .992
    Φ(s)                                    .692   .773   .841   .933   .977
    Φ(s) - (1/12)(E nU_1^4) Φ^(4)(s)        .674   .754   .824   .928   .981

n = 8 (7 degrees of freedom)

    s                                       .50    1.00   2.00   2.50
    Actual value P(S_n < s)                 .675   .824   .984   .999
    Φ(s)                                    .692   .841   .977   .994
    Φ(s) - (1/12)(E nU_1^4) Φ^(4)(s)        .679   .829   .980   .997
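The first-correction row of the table can be recomputed with the standard library. This sketch assumes two standard facts not spelled out here: for U uniform on the unit sphere in n-space, n times the fourth moment of a coordinate equals 3/(n+2), and the fourth derivative of the normal c.d.f. is (3s - s^3) times the normal density. Function names are illustrative.

```python
import math

def normal_cdf(s):
    return 0.5 * (1.0 + math.erf(s / math.sqrt(2.0)))

def normal_pdf(s):
    return math.exp(-s * s / 2.0) / math.sqrt(2.0 * math.pi)

def normal_cdf_fourth_derivative(s):
    """Fourth derivative of the N(0,1) c.d.f.: (3s - s^3) * pdf(s)."""
    return (3.0 * s - s ** 3) * normal_pdf(s)

def first_correction(n, s):
    """Phi(s) - (1/12) * (n E U_1^4) * Phi''''(s) for normal sampling."""
    n_fourth_moment = 3.0 / (n + 2)    # n * E[U_1^4], U uniform on the sphere
    return normal_cdf(s) - n_fourth_moment / 12.0 * normal_cdf_fourth_derivative(s)

for n, grid in ((5, (0.5, 0.75, 1.0, 1.5, 2.0)), (8, (0.5, 1.0, 2.0, 2.5))):
    print(n, [round(first_correction(n, s), 3) for s in grid])
```

The correction changes sign at s equal to the square root of 3, which is why the corrected value falls below the normal c.d.f. for small s and rises above it in the upper tail.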

Chung [1] has given an expansion for $S_n$ directly from the
moments of the $X_i$, rather than through the normalized vector U.
His expansion does not require orthant symmetry. On the other hand,
it does require the existence of higher order moments of the $X_i$,
while the formula given here does not. We can therefore apply our
expansion to such interesting cases as Cauchy-distributed $X_i$. We
discuss long-tailed error laws in some detail in the next section.

5. Long-tailed Error Laws.

Looking again at Figure 1, let us imagine computing the c.d.f.

$$F_\xi(s) = P(S_\xi < s)$$

of the generalized binomial random variable $S_\xi$ (defined in
Section 3) for each value of $\xi$ in $S_n^+$. If n is even moderately
large, say $n \geq 10$, $F_\xi$ will be reasonably well approximated by the
N(0,1) c.d.f. $\Phi$ for values of s in the usual testing range, as
long as $\xi$ is near the main diagonal point $e = (1/\sqrt{n}, \ldots, 1/\sqrt{n})$. The
central limit theorem will fail more and more drastically as $\xi$
approaches the corners* of $S_n^+$, which is to say as the components
of $\xi$ become more unequal in magnitude. The extreme case is
$\xi = (1, 0, 0, \ldots, 0)$, which yields $S_\xi = \pm 1$ with probabilities 1/2.

As we have discussed, the deviations of $F_\xi$ from $\Phi$ will always
be in the platykurtic direction, kurtosis$(S_\xi) = -2 \sum_{i=1}^{n} \xi_i^4$, with
a general tendency for $F_\xi(z_\alpha)$ to exceed the nominal value
$\Phi(z_\alpha) = 1 - \alpha$ for the customary $\alpha$ values. (Computer experimentation
has shown that for small values of n this tendency to err in the
conservative direction is more drastic than indicated by the
Edgeworth correction.)

*It should be remembered that in higher-dimensional space there are
"zero-dimensional corners", "one-dimensional corners",
"two-dimensional corners", etc. We use "corner" here for a low
dimensional boundary of $S_n^+$, in a sense which will be made explicit.

Now let us consider the case where the $X_i$ are independent and
identically distributed random variables, symmetric about 0. If
the $X_i$ have finite variance $\sigma^2$, then writing $S_n$ as

$$S_n = \frac{\sum_{i=1}^{n} X_i / \sqrt{n}}{\sqrt{\sum_{i=1}^{n} X_i^2 / n}}$$

shows that $S_n$ is asymptotically N(0,1), since the numerator
approaches $N(0, \sigma^2)$ by the central limit theorem while the
denominator approaches $\sigma$ by the law of large numbers. In terms
of our picture, this means that the mixing distribution $\Lambda_n^+$ must,
for large n, put nearly all of its mass in that portion of $S_n^+$
sufficiently near e for the central limit theorem to yield good
approximations.*

*In the case of the uniform distribution $\Lambda_n^+ = \lambda_n^+$, for example, it is
easy to prove that for any $\epsilon, \delta > 0$ we have
$\lambda_n^+\{\xi : \sup_s |F_\xi(s) - \Phi(s)| > \epsilon\} < \delta$ for all n sufficiently large.

If $\sigma^2$ is infinite our derivation of limiting normality for
$S_n$ fails in both the numerator and denominator. Intuitively, we
expect that if the $X_i$ have a long-tailed error law, $\sigma^2 = \infty$, then
$\Lambda_n^+$ will put much more of its mass near the corners of $S_n^+$. The
very term "long-tailed" implies occasional freakishly large
values for the $X_i$, which result in $\xi$ vectors near these corners.
From the mixture lemma we then expect $S_n$ to have a much more
platykurtic distribution, the most extreme case being $\mathrm{kurt}(S_n) = -2$
if $\xi$ always has only one non-zero component.

Two asymptotic results supporting these intuitive arguments
can be reported for the case where the $X_i$ have a stable distribution
law of order $\alpha$, $0 < \alpha < 2$. (We are still assuming that the
$X_i$ are i.i.d. and symmetric about 0.) Darling [3] shows that, in
our notation,

$$E\left[\frac{1}{\max_{1 \leq i \leq n} \xi_i^2}\right] \to \frac{1}{1 - \frac{1}{2}\alpha} \qquad \text{as } n \to \infty,$$

which implies $\lim E[\max_i \xi_i^2] \geq 1 - \frac{1}{2}\alpha$. We cannot expect limiting normality,
therefore, since $\Lambda_n^+$ must give high probability to $\xi$ vectors whose
components are not uniformly small (UAN). In an unpublished
paper [14], Logan, Mallows, Rice, and Shepp* show that $S_n$ must
approach a limiting law when the $X_i$ are i.i.d. stable variates,
and give an integral expression for the characteristic function
of the limit. This law has kurtosis = $-2 + \alpha$, so as $\alpha$ goes to
0 we approach the degenerate case.

We conclude this section with an example of a long-tailed
error law very closely related to the normal law. This example
has the advantage of easy calculation of $\tilde\Lambda_n^+$ for any value of n,
and is helpful in picturing many of the bizarre sampling effects
of long-tailed distributions, such as Darling's result above.

We let

$$\tilde{X}_i = \frac{1}{X_i} \qquad \text{for } i = 1, 2, \ldots, n,$$

where the $X_i$ are independent N(0,1) random variables. Then $\tilde{X}_i$
has the density

$$f(\tilde{x}_i) = \frac{1}{\sqrt{2\pi}\, \tilde{x}_i^2} e^{-1/(2 \tilde{x}_i^2)},$$

and has Cauchy-like tail behavior, being attracted to a stable
law of order $\alpha = 1$. (The square $Z_i = \tilde{X}_i^2$ has density

$$f(z_i) = \frac{1}{\sqrt{2\pi}} z_i^{-3/2} e^{-1/(2 z_i)},$$

which is exactly the positive stable law of order 1/2, a fact we use
below. See [5], page 170.)

*I am grateful to the authors for allowing me to report these results.
Consider the mapping $\tilde{X} = g(X)$ which takes vectors into
vectors by inverting each component, $\tilde{X}_i = 1/X_i$, $i = 1, 2, \ldots, n$.
Since $g(cX) = \frac{1}{c} g(X)$ for any c > 0, this mapping induces a mapping
on rays in $E^n$, and in particular induces a mapping of $S_n^+$ onto
itself, say $\tilde\xi = g^+(\xi)$. (For example, if n = 3 and X = (2,1,2),
so $\xi = (\frac{2}{3}, \frac{1}{3}, \frac{2}{3})$, then $g(X) = (\frac{1}{2}, 1, \frac{1}{2})$ and
$g^+(\xi) = (\frac{1}{\sqrt{6}}, \frac{2}{\sqrt{6}}, \frac{1}{\sqrt{6}})$.)

We see that the distribution $\tilde\Lambda_n^+$ induced on $S_n^+$ by taking the
$\tilde{X}_i$ to be inverted normals, as defined above, is obtained by
"inverting" the uniform distribution $\lambda_n^+$ via $g^+$,

$$\tilde\Lambda_n^+\{g^+(A)\} = \lambda_n^+\{A\} \qquad \text{for } A \subset S_n^+$$

(since X itself induces $\lambda_n^+$). By putting coordinates on $S_n^+$ we
can easily calculate $d\tilde\Lambda_n^+\{\xi\} / d\lambda_n^+\{\xi\}$ for any value of $\xi$, from the
properties of $g^+$. (This is done in the appendix.) Let us just note
here that $d\tilde\Lambda_n^+\{e\} / d\lambda_n^+\{e\} = 1$, which is not surprising since e is the fixed
point of $g^+$, and more importantly, $d\tilde\Lambda_n^+\{\xi\} / d\lambda_n^+\{\xi\} \to \infty$ as $\xi$ approaches any
corner of $S_n^+$ of dimension* $d < \frac{n}{2} - 1$. This is clear since $g^+$ maps
any boundary line of $S_n^+$ into the opposite corner. Roughly speaking,
$g^+$ maps points $\xi$ near high dimensional boundaries of $S_n^+$, where
$S_\xi$ is nearly normal, into points $\tilde\xi$ near low dimensional corners of
$S_n^+$, where $S_{\tilde\xi}$ tends to be non-normal in the platykurtic sense.
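The inversion map on the positive orthant is simple to compute. The sketch below, with hypothetical helper names, reproduces the n = 3 example and illustrates that the main diagonal is fixed, that the map is its own inverse, and that a point near a boundary face is sent near a vertex.

```python
from math import sqrt, isclose

def to_sphere(v):
    """Scale a vector to unit Euclidean length."""
    norm = sqrt(sum(c * c for c in v))
    return [c / norm for c in v]

def g_plus(xi):
    """Invert each (strictly positive) component and renormalize to the unit sphere."""
    return to_sphere([1.0 / c for c in xi])

xi = to_sphere([2.0, 1.0, 2.0])        # (2/3, 1/3, 2/3)
image = g_plus(xi)                     # compare with (1, 2, 1)/sqrt(6)
diagonal = to_sphere([1.0, 1.0, 1.0])
near_face = to_sphere([0.01, 1.0, 1.0])   # first coordinate nearly zero

print([round(c, 4) for c in image])
print([round(c, 4) for c in g_plus(near_face)])
```

The second printed vector has almost all of its length in the first coordinate, which is the sense in which mass near high dimensional boundaries is carried to low dimensional corners.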

The formula for the kurtosis of $S_n$ given in Section 3 can be
evaluated explicitly here by making use of the fact that $\tilde{X}_i^2$ is
stable of order 1/2:

$$\mathrm{kurt}(S_n) = -\frac{4n}{\pi} \int_0^{\pi/2} \left[ \frac{1}{1 + (n-1)^2 \tan^2\theta} \right]^2 d\theta = -1 - \frac{1}{n} + O\!\left(\frac{1}{n^2}\right).$$

The approach to the limiting value -1 given by Logan et al. is
seen to be rather rapid. (Details given in appendix.)

*In the case n = 2, $\tilde\Lambda_2^+ = \lambda_2^+$ and there are no poles.
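The kurtosis of the inverted-normal example can be evaluated numerically. This sketch assumes the representation kurt = -2n E[U_1^4] from Section 3 together with the identity that, for inverted normals, the squared first coordinate is distributed as 1/(1 + (n-1)^2 tan^2(theta)) with theta uniform on (0, pi/2); that identity is derived here from the stable scaling of order 1/2 and should be read as an assumption of the sketch. The integrand is rewritten with sines and cosines to avoid the singularity of the tangent.

```python
import math

def kurtosis_inverted_normal(n, steps=20000):
    """-(4n/pi) * integral over [0, pi/2] of [1/(1+(n-1)^2 tan^2 t)]^2, by Simpson's rule."""
    a2 = float((n - 1) ** 2)

    def integrand(t):
        c2 = math.cos(t) ** 2
        s2 = math.sin(t) ** 2
        return (c2 / (c2 + a2 * s2)) ** 2   # same as [1/(1 + a2*tan^2 t)]^2

    h = (math.pi / 2) / steps
    total = integrand(0.0) + integrand(math.pi / 2)
    for k in range(1, steps):
        total += (4 if k % 2 else 2) * integrand(k * h)
    return -4.0 * n / math.pi * (total * h / 3)

for n in (2, 5, 10, 50):
    print(n, round(kurtosis_inverted_normal(n), 6), round(-1 - 1 / n, 6))
```

Numerically the integral appears to agree with -1 - 1/n for every n tried, including the value -1.5 at n = 2, where the mixing distribution is uniform.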


6. The Sign Test and Wilcoxon's Signed Rank Test.

Orthant symmetry of the vector $U = X/\|X\|$ is preserved under
many familiar statistical operations. In general we can define

$$\tilde{U} = g(U)$$

by simply specifying that g map every orthant into itself in a
manner defined by a mapping

$$\tilde\xi = g^+(\xi)$$

of $S_n^+$ into itself. If U has orthant symmetry with measure $\Lambda_n^+$ on
$S_n^+$, then the mapped vector $\tilde{U}$ will also have orthant symmetry with
induced measure

$$\tilde\Lambda_n^+\{g^+(A)\} = \Lambda_n^+\{A\}$$

for every $A \subset S_n^+$. The theorems and heuristics of the previous
sections then apply as well to the statistic

$$\tilde{S}_n = \sum_{i=1}^{n} \tilde{U}_i$$

as to the original Student's statistic $S_n = \sum_{i=1}^{n} U_i$. To emphasize
this point, we will call $\tilde{S}_n$ a generalized Student's statistic.
Generalized Student's statistics include most of the common
nonparametric tests for the one sample problem. For instance, if
$g^+(\xi) = e$, that is g maps U into $\frac{1}{\sqrt{n}} (\Delta_1, \Delta_2, \ldots, \Delta_n)$ where $\Delta_i = \mathrm{sign}(U_i)$,
then $\tilde{S}_n$ is simply the "sign test" for symmetry about the origin.
If

$$g^+(\xi) = \left[ \frac{6}{n(n+1)(2n+1)} \right]^{1/2} (R_1, R_2, \ldots, R_n),$$

where $R_i$ is the rank of $\xi_i$ among $\xi_1, \xi_2, \ldots, \xi_n$ (the smallest
having rank 1), then $\tilde{S}_n$ is Wilcoxon's signed rank test.

The effect of these transformations is to move $\xi$ away from
the corners of $S_n^+$. For the two transformations given above, it
is easy to see that we move close enough to the center point e
so that limiting normality is guaranteed under orthant symmetry*.
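Both rank tests can be written as sums of coordinates of a transformed unit vector. The sketch below uses hypothetical function names and data, and assumes no zeros and no ties among the absolute values; the rank vector is normalized by the square root of n(n+1)(2n+1)/6, the sum of squared ranks.

```python
from math import sqrt

def sign(v):
    return 1 if v > 0 else -1

def sign_statistic(x):
    """Generalized Student's statistic for the sign test: sum of signs over sqrt(n)."""
    return sum(sign(v) for v in x) / sqrt(len(x))

def signed_rank_statistic(x):
    """Generalized Student's statistic for Wilcoxon's test: signed ranks on the unit sphere."""
    n = len(x)
    order = sorted(range(n), key=lambda i: abs(x[i]))
    rank = [0] * n
    for r, i in enumerate(order, start=1):   # smallest absolute value gets rank 1
        rank[i] = r
    scale = sqrt(n * (n + 1) * (2 * n + 1) / 6)
    return sum(sign(v) * r for v, r in zip(x, rank)) / scale

x = [1.2, -0.4, 2.5, 0.7, -0.1]        # hypothetical observations
print(round(sign_statistic(x), 4), round(signed_rank_statistic(x), 4))
```

Each statistic has mean 0 and variance 1 under orthant symmetry, so both can be referred to the same critical values as the original statistic.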

A reasonable question at this point is "why worry about
limiting normality if the type I errors tend to be in the
conservative direction in any case?" The answer, of course, is that we
are also interested in the power of the test, which for the
unmodified t-test may be nil in long-tailed cases. Power considerations
are discussed, in an abbreviated manner, in the next two sections.

*For the sign test this is immediate, while for the signed rank
test it follows from the Lindeberg condition, [5], p. 256.

It should be noticed that the question of power involves a
property of the X vector which we have ignored up until now,
namely its length, $\|X\|$. To see this, consider what happens to
the distribution of the generalized binomial $S_\xi$ if we move the
center of orthant symmetry from the origin to the point
$(\delta/\sqrt{n}, \delta/\sqrt{n}, \ldots, \delta/\sqrt{n})$, $\delta > 0$, i.e. we undergo a component-wise
translation to the right. The rough effect is to translate the
distribution of $S_\xi$ a random amount $\sqrt{n}\,\delta / L_\xi$ to the right, where $L_\xi$ has the
conditional distribution of $\|X\|$ given $\xi$ (calculated for the case
$\delta = 0$). In the normal case $X_i$ i.i.d. $N(\delta/\sqrt{n}, 1)$, $L_\xi$ has a $\chi_n$
distribution, independent of $\xi$, and all the $S_\xi$ distributions translate
in the same way. In general, this will not be the case. For
translations of the inverted normals of Section 5, for instance, it is
easy to show that $L_\xi \sim c(\xi)/\chi_n$, where $c(\xi) = \sqrt{\sum_{i=1}^{n} \xi_i^{-2}}$. Thus the
translation effect on $S_\xi$, which yields the power of the test, is
largest for $\xi = e$, and decreases as indicated as $\xi$ moves away
from e.

7. Legalized Cheating for Increased Power: Winsorization.

The two examples of generalized Student's statistics given
in Section 6 relate to rank tests for the one-sample problem.
Other familiar statistical operations can be discussed from this
point of view. The example of this section relates permutation
tests to Winsorization via generalized Student's statistics.
Consider the problem of testing the null hypothesis

$$H_0 : X_i \text{ i.i.d. random variables, symmetric about 0}$$

versus the specific alternative

$$H_1 : X_i \text{ i.i.d. with density } \frac{1}{2\sigma} e^{-|x_i - \mu|/\sigma}, \qquad \mu > 0.$$
It is well known that if we want a genuine level $\alpha$ test for $H_0$,
we must content ourselves with a permutation test (see [7], page
203, problem 11). That is, we must condition on

$$Y = (Y_1, Y_2, \ldots, Y_n), \qquad Y_i = |X_i|, \quad i = 1, 2, \ldots, n,$$

and for each Y choose 100$\alpha$% of the $2^n$ possible vectors
$X = (\delta_1 Y_1, \delta_2 Y_2, \ldots, \delta_n Y_n)$, $\delta_i = \pm 1$, as rejection points.

Just as in the more familiar two-sample permutation test
([7], page 175), we maximize the conditional power of the test if
we reject for those X with the maximum probability density under
$H_1$,

$$f_{H_1}(X) = (2\sigma)^{-n} \exp\left( -\sum_{i=1}^{n} |X_i - \mu| / \sigma \right).$$

Since we can write

$$\sum_{i=1}^{n} |\delta_i Y_i - \mu| = \sum_{i=1}^{n} (Y_i - Y_i^{[\mu]}) + n\mu - \sum_{i=1}^{n} \delta_i Y_i^{[\mu]},$$

where $Y_i^{[\mu]} = \min(Y_i, \mu)$, it is equivalent to reject for those
choices of $\delta = (\delta_1, \delta_2, \ldots, \delta_n)$ maximizing

$$\sum_{i=1}^{n} \delta_i Y_i^{[\mu]}.$$
Now suppose we do not know the correct value of $\mu$, so we
"cheat" by first looking at the values of $Y_1, Y_2, \ldots, Y_n$, and
calculating some scale invariant estimate of $\mu$, say $\hat\mu(Y)$,

$$\hat\mu(cY) = c\,\hat\mu(Y) \qquad \text{for all } c > 0.$$

[An example would be to choose $\hat\mu$ to maximize the number of observed
Y points in $\hat\mu \pm \epsilon \|Y\|$, where $\epsilon$ is a positive constant; i.e. choose
$\hat\mu$ to be a modal point of the $Y_i$'s. Or we could use maximum
likelihood estimation, choosing $(\hat\mu, \hat\sigma)$ to maximize
$\prod_{i=1}^{n} \frac{1}{2\sigma} \left( e^{-|Y_i - \mu|/\sigma} + e^{-|Y_i + \mu|/\sigma} \right)$.]

The mapping which takes $Y$ into $Y^{[\hat\mu]} = (Y_1^{[\hat\mu]}, Y_2^{[\hat\mu]}, \ldots, Y_n^{[\hat\mu]})$
takes rays into rays, that is it takes $cY$ into $cY^{[\hat\mu]}$ for any
$c > 0$. It therefore induces a mapping on $S_n^+$ of the form discussed
in Section 6:

    $\zeta = g^+(\xi) = \xi^{[\hat\mu(\xi)]} / \|\xi^{[\hat\mu(\xi)]}\|$.

This in turn induces a mapping $\tilde U = g(U)$ on all of $S_n$ by copying
the map $g^+$ in each of the $2^n - 1$ other orthants. The corresponding
generalized Student's test, which rejects for

    $\sum_{i=1}^n \tilde U_i > s_{\zeta,\alpha}$,

is seen to be an approximation to the most powerful level $\alpha$ test
for $H_0$ versus $H_1$. Theoretically $s_{\zeta,\alpha}$ should be chosen to give
$\alpha \cdot 2^n$ values of $\sum_{i=1}^n \delta_i \zeta_i$, $\delta_i = \pm 1$, greater than $s_{\zeta,\alpha}$, but from our
previous discussion of generalized binomials we feel safe in
choosing $s_{\zeta,\alpha} = s_{n,\alpha}$, the upper $\alpha$ point of $S_n$ under normality, or
even more simply $s_{\zeta,\alpha} = z_\alpha$, the upper $\alpha$ point of a $N(0,1)$ random
variable. (Note that the mapping $\zeta = g(\xi)$ once again moves us
away from the corners of $S_n^+$.) We know that the generalized
Student's test we have constructed will have approximate size $\alpha$
for the null hypothesis of orthant symmetry, which includes $H_0$,
and if the estimate $\hat\mu$ of $\mu$ is at all accurate, it should have good
power under $H_1$.

There is nothing particularly compelling about the choice
of a double exponential distribution for the $X_i$, except that it
leads to a Student's test based on Winsorized values of the
observations, rather than the raw values. What is striking,
though, is how little one may deviate from the normal translation
model without inducing drastic changes in the form of the
appropriate test statistic (cf. [12]).

In general, suppose we are testing $H_0$ versus

    $H_1$: $X_i$ i.i.d. with density $\frac{1}{\sigma} f\bigl(\frac{x-\mu}{\sigma}\bigr)$, $\mu > 0$.

Defining $Y_i = |X_i|$, $i = 1, 2, \ldots, n$, as before, the most powerful
level $\alpha$ test is a permutation test which rejects for the $100\alpha\%$
largest values of $\sum_{i=1}^n \delta_i \tilde Y_i$, $\delta_i = \pm 1$, where

    $\tilde Y_i = \frac{1}{2} \log \frac{f\bigl(\frac{Y_i-\mu}{\sigma}\bigr)}{f\bigl(\frac{-Y_i-\mu}{\sigma}\bigr)}$,   $i = 1, 2, \ldots, n$.

If we do not know the parameters $\mu$ and $\sigma$, we can estimate them
from the absolute values $Y_1, Y_2, \ldots, Y_n$ in any way we want, subject
only to the restriction that the resulting mapping $Y \to \tilde Y$ takes
rays into rays (maximum likelihood estimation will always have
this property). As in the exponential case, we are led to a
generalized Student's test which rejects for large values of $\tilde S_n$.
The mapping $\zeta = g^+(\xi)$ which determines which generalized Student's
test we use is given by

    $\zeta_i \propto \frac{1}{2} \log \frac{f\bigl(\frac{\xi_i-\hat\mu}{\hat\sigma}\bigr)}{f\bigl(\frac{-\xi_i-\hat\mu}{\hat\sigma}\bigr)}$,   $i = 1, 2, \ldots, n$,

where $(\hat\mu, \hat\sigma)$ is the estimate of $\mu$ and $\sigma$ given that $Y = \xi$. We have
not actually cheated on our $\alpha$-level, since mappings based on $Y$
preserve orthant symmetry under the null hypothesis.

If $f$ is the normal density then the procedure above yields
the ordinary Student's test, but with other, even slightly
different, kernels, far different tests are called for. This
approach is not limited to translation and scale parameter
families, but the author has not investigated the more interesting
problem of obtaining useful estimates of $f$ from the absolute
values in general situations.
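As a small check on the two cases discussed above, the half log-likelihood ratio can be evaluated directly. The sketch below assumes known $\mu$ and $\sigma$ and uses hypothetical function names; it is meant to illustrate how the normal kernel leaves the observations essentially untouched while the double exponential kernel Winsorizes them.

```python
def score(y, mu, sigma, log_f):
    """Half log-likelihood ratio 0.5*log(f((y-mu)/sigma) / f((-y-mu)/sigma))."""
    return 0.5 * (log_f((y - mu) / sigma) - log_f((-y - mu) / sigma))

def log_normal(z):
    """Log of the normal kernel; additive constants cancel in the ratio."""
    return -0.5 * z * z

def log_double_exponential(z):
    """Log of the double exponential kernel, again up to a constant."""
    return -abs(z)

if __name__ == "__main__":
    mu, sigma = 0.5, 1.5                      # hypothetical parameter values
    for y in (0.3, 2.0):
        # normal kernel: y*mu/sigma**2, linear in y (ordinary Student's test)
        print(y, score(y, mu, sigma, log_normal))
        # double exponential kernel: min(y, mu)/sigma (Winsorized values)
        print(y, score(y, mu, sigma, log_double_exponential))
```

Since the normal score is a constant multiple of $y$, normalizing to unit length returns $\xi$ itself, while the double exponential score is flat beyond $\mu$.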


8.  Simultaneous  Use  of  Student's  Test  and  The  Sign  Test. 

Another approach to safeguarding the power of a one-sample
test is to use more than one test on the data. For instance, we
might use Student's statistic $S_n = \sum_{i=1}^n U_i$ in conjunction with the
sign test $\tilde S_n = \sum_{i=1}^n \mathrm{Sign}(U_i)/\sqrt{n}$. In the language of Section 6, we
would be simultaneously using two generalized Student's statistics,
one based on $\xi$ itself, the other on $\zeta = e$.

If $\tilde S_n$ is any generalized Student's statistic, based on the
mapping $\zeta = g^+(\xi)$, then the vector $(S_n, \tilde S_n)$ can be expressed as

    $(S_n, \tilde S_n) = \sum_{i=1}^n \Delta_i (\xi_i, \zeta_i)$,

where, as before, $\Delta_i = \mathrm{Sign}(X_i)$, $i = 1, 2, \ldots, n$. Conditioning on
the value of $\xi$, the $\Delta_i$ are independently $\pm 1$ with probabilities $\frac{1}{2}$
under the null hypothesis of orthant symmetry. Conditionally, the
random vector will have mean $(0,0)$ and covariance matrix

    $\Sigma_\xi = \begin{pmatrix} 1 & r \\ r & 1 \end{pmatrix}$,

where $r = \sum_{i=1}^n \xi_i \zeta_i = \xi \cdot g^+(\xi)$. In the case of the sign test,
$r = \sum_{i=1}^n \xi_i/\sqrt{n}$. By the central limit theorem, $(S_n, \tilde S_n)$ will have,
approximately, a normal distribution

    $(S_n, \tilde S_n) \sim N((0,0), \Sigma_\xi)$,

the approximation holding best for $\xi$ and $\zeta$ near $e$.

Now if we wish to use both $S_n$ and $\tilde S_n$ simultaneously on the
same set of data, we can accept the normal approximation as being
sufficiently accurate, and base our decision on

    $\hat S_n = \max(S_n, \tilde S_n)$,

whose distribution can be read out of standard bivariate normal
tables. Here is a small table for the approximate upper 5% point
of $\hat S_n$:

    r = \sum \xi_i \zeta_i    .50   .55   .60   .65   .70   .75   .80   .85   .90   .95  1.00
    Approximate 5% point      1.91  1.91  1.90  1.89  1.88  1.86  1.85  1.83  1.80  1.76  1.65

These numbers should be compared with 1.96, the upper 5% point if
you use the usual bound $P(\hat S_n < s) \ge 1 - P(S_n > s) - P(\tilde S_n > s)$, and
1.65, the upper 5% point if you perform either one of the tests
separately.
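Entries of this kind can be recomputed without tables. The sketch below is one possible way to do it, not the method used to build the table: it writes $P(\max(Z_1, Z_2) \le s)$ for a standard bivariate normal pair with correlation $r$ as a one-dimensional integral, evaluates it by Simpson's rule, and solves for $s$ by bisection. The function names, the integration range and the step counts are all assumptions of the sketch.

```python
import math

def Phi(z):
    """Standard normal distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

def phi(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)

def prob_max_below(s, r, steps=2000):
    """P(max(Z1, Z2) <= s) for correlation r, via
    integral_{-inf}^{s} phi(z) * Phi((s - r z)/sqrt(1 - r^2)) dz (Simpson)."""
    if r >= 1.0:
        return Phi(s)                      # Z1 and Z2 coincide
    lo = -10.0                             # effectively minus infinity
    h = (s - lo) / steps
    scale = math.sqrt(1.0 - r * r)
    total = 0.0
    for k in range(steps + 1):
        z = lo + k * h
        weight = 1 if k in (0, steps) else (4 if k % 2 else 2)
        total += weight * phi(z) * Phi((s - r * z) / scale)
    return total * h / 3.0

def upper_point(r, alpha=0.05):
    """Solve P(max(Z1, Z2) > s) = alpha for s by bisection."""
    lo, hi = 0.0, 5.0
    for _ in range(40):
        mid = 0.5 * (lo + hi)
        if 1.0 - prob_max_below(mid, r) > alpha:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

if __name__ == "__main__":
    for r in (0.50, 0.60, 0.70, 0.80, 0.90, 1.00):
        print(f"r = {r:.2f}   upper 5% point = {upper_point(r):.2f}")
```

At $r = 1$ the two statistics coincide and the routine returns the one-sided normal point 1.645; at $r = 0$ it returns about 1.95, already slightly below the Bonferroni-type value 1.96.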

The value of $r$ depends only on $\xi$, and so, as in the last
section, we can compute it before we decide whether or not we want
to use a simultaneous test. In the case where $\tilde S_n$ is the sign test,
$r = \sum_{i=1}^n \xi_i/\sqrt{n}$, the computed value of $r$ should ordinarily be quite
large. If the $X_i$ are i.i.d., each distributed as $X_0$ plus a shift of
order $1/\sqrt{n}$, for $i = 1, 2, \ldots, n$,
where $X_0$ has a finite second moment and is symmetrically distributed
about the origin, then $r$ will approach in probability the constant
$E|X_0|/\sqrt{E X_0^2}$ as $n$ goes to infinity. If $X_0 \sim N(0,\sigma^2)$ this limit is
$\sqrt{2/\pi} = .798$, while if $X_0$ is double exponential the limit equals
$1/\sqrt{2} = .707$. The variance of $r$ in the case of normal components is
of order $1/n$. If the computed value of $r$ is not large, we have a
strong indication of non-normality, and it is probably best not
to use $S_n$ at all.
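The two limiting values quoted above follow from the absolute and second moments of the parent distribution, and can be reproduced as follows. The last function is our own first-order (delta-method) calculation for normal components, included only as a rough guide to the size of the sampling variance of $r$; it is not a figure taken from the text, and all names are hypothetical.

```python
import math

def limit_normal():
    """E|X0| / sqrt(E X0^2) for X0 ~ N(0, sigma^2): sqrt(2/pi), free of sigma."""
    return math.sqrt(2.0 / math.pi)

def limit_double_exponential():
    """E|X0| / sqrt(E X0^2) for the double exponential: s / sqrt(2 s^2)."""
    return 1.0 / math.sqrt(2.0)

def normal_variance_constant():
    """Delta-method constant k in Var(r) ~ k/n for normal components.

    With m1 = mean|Z| and m2 = mean Z^2, r = m1/sqrt(m2). At the limit the
    gradient is (1, -a/2) with a = E|Z|, and
    Var|Z| = 1 - a^2, Var Z^2 = 2, Cov(|Z|, Z^2) = E|Z|^3 - a = a.
    """
    a = math.sqrt(2.0 / math.pi)
    var_abs = 1.0 - a * a
    var_sq = 2.0
    cov = a
    return var_abs + (a / 2.0) ** 2 * var_sq - 2.0 * (a / 2.0) * cov

if __name__ == "__main__":
    print(round(limit_normal(), 3))
    print(round(limit_double_exponential(), 3))
    print(round(normal_variance_constant(), 4))
```

Under this approximation the constant simplifies to $1 - 3/\pi$, so that for moderate $n$ a computed $r$ well below .8 is several standard deviations away from its normal-theory value.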

The normal approximation to the conditional joint distribution
of $(S_n, \tilde S_n)$ given $\xi$ matches exactly the first and second moments
(i.e. the mean vector and covariance matrix), and is conservative
with respect to the higher moments exactly as in the theorem of
Section 3: we can consider the general case of $k$ simultaneous
generalized Student's statistics,

    $\tilde S = (\tilde S_n(1), \tilde S_n(2), \ldots, \tilde S_n(k)) = \sum_{i=1}^n \Delta_i (\zeta_i(1), \zeta_i(2), \ldots, \zeta_i(k))$,

where $\xi$ determines the $k$ vectors $\zeta(j)$ via $\zeta(j) = g_j(\xi)$, and given
$\xi$, the $\Delta_i$ are independently $+1$ or $-1$ with probabilities $\frac{1}{2}$ as before.
$\tilde S$ has conditional mean vector $(0, 0, \ldots, 0)$ and covariance
matrix $\Sigma_\xi = [\zeta(j) \cdot \zeta(j')]_{j,j' = 1,\ldots,k}$. For any vector
$V = (V_1, V_2, \ldots, V_k)$, we then have

    $E(V \cdot \tilde S)^\nu < (V \Sigma_\xi V')^{\nu/2}\, E N^\nu(0,1)$

for $\nu = 4, 6, 8, \ldots$. (This follows from Section 3 by noting that
$V \cdot \tilde S$ is itself a generalized binomial scaled by a factor $(V \Sigma_\xi V')^{1/2}$.)
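For small $n$ the moment bound can be checked by brute force, since conditionally on $\xi$ the statistic takes only $2^n$ equally likely values. The sketch below does this for $k = 2$ (Student's statistic together with the sign test); the vector $\xi$, the weights $V$ and the function names are hypothetical choices made for the illustration.

```python
from itertools import product
import math

def conditional_moment(weights, nu):
    """E (sum_i Delta_i w_i)^nu over all 2^n equally likely sign vectors."""
    n = len(weights)
    total = 0.0
    for delta in product((1, -1), repeat=n):
        total += sum(d * w for d, w in zip(delta, weights)) ** nu
    return total / 2 ** n

def normal_moment(nu):
    """E N(0,1)^nu = 1 * 3 * 5 * ... * (nu - 1) for even nu."""
    return math.prod(range(1, nu, 2))

def unit(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

xi = unit([1, 2, 3, 4, 5, 6, 7, 8])   # hypothetical values of |U_i|
e = unit([1] * 8)                     # zeta for the sign test
V = (0.7, -0.4)                       # arbitrary combination weights
w = [V[0] * a + V[1] * b for a, b in zip(xi, e)]   # V.S = sum Delta_i w_i
r = sum(a * b for a, b in zip(xi, e))              # off-diagonal of Sigma_xi
scale2 = sum(x * x for x in w)                     # equals V Sigma_xi V'

if __name__ == "__main__":
    for nu in (4, 6, 8):
        print(nu, conditional_moment(w, nu), scale2 ** (nu / 2) * normal_moment(nu))
```

The second moment is matched exactly, while each higher even moment of $V \cdot \tilde S$ falls strictly below the corresponding normal moment.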

The expectation here is conditional with respect to the
observed value of $\xi$. The inequality may not hold with respect
to the unconditional distribution of $\tilde S$, which has mean vector 0
and covariance matrix $\Sigma = E\Sigma_\xi$. This brings up an interesting
point: there is no particular reason to approximate the
unconditional distribution of $\tilde S$ with a multivariate normal, since it
is really a mixture of such approximations with different
covariance matrices. Asymptotic normality of $\tilde S$ comes from the
fact that under certain conditions $\Sigma_\xi$ will go to a limiting matrix
in probability as $n$ gets large. However, for moderate $n$ it seems
more sensible to work directly with the conditional distribution,
which is a fortiori approximately normal. This point is made more
emphatically in the next section.

9. Conditional Versus Unconditional Distribution: Angle From an
Arbitrary Vector.

So far we have been able to gloss over the distinction between
applying the generalized Student's tests conditionally (conditional
on $\xi$) as opposed to unconditionally. This was primarily because
the conditional random variable $S_\xi$ had the same mean and variance
for all values of $\xi$. We can destroy the pleasant situation, and
further explore the nature of the approximations we have been
using, by considering the random angle between $X$ and an arbitrary
fixed vector $c \in S_n^+$, $c \ne e$. Let $\theta_{X,c}$ be this angle, and define

    $S_n(c) = \sqrt{n} \cos\theta_{X,c} = \sqrt{n}\, c \cdot U$,

where $U = X/\|X\|$ as before (recalling that $S_n = \sqrt{n} \cos\theta_{X,e}$).
Conditioning on $\xi = (|U_1|, |U_2|, \ldots, |U_n|)$, this can be written

    $S_\xi(c) = \sqrt{n \sum_{i=1}^n c_i^2 \xi_i^2}\; \sum_{i=1}^n \Delta_i \zeta_i$,

where

    $\zeta_i = \frac{c_i \xi_i}{\sqrt{\sum_{j=1}^n c_j^2 \xi_j^2}}$,   $i = 1, 2, \ldots, n$,

and under orthant symmetry the $\Delta_i$ equal $+1$ or $-1$ independently
with probabilities $\frac{1}{2}$. The sum $S_\zeta = \sum_{i=1}^n \Delta_i \zeta_i$ is a generalized
binomial random variable, as defined in Section 3, and we see
that the mixture lemma of that section takes the following form
in our present situation:

    $P(S_n(c) < s) = E_\xi\, P\Bigl(\sqrt{n \sum_{i=1}^n c_i^2 \xi_i^2}\; S_\zeta < s\Bigr)$.

If $n$ is moderately large and the components $\zeta_i$ are not too
drastically different from one another, the conditional distribution
of $S_\xi(c)$ can be well approximated by a $N(0, n\sum c_i^2 \xi_i^2)$ distribution.
The conditional variance has expected value 1 over all realizations
of $\xi$, but in a testing situation we might prefer to work directly
with the conditional value $n\sum c_i^2 \xi_i^2$, particularly since we will
usually be unable to approximate the unconditional distribution of
$S_n(c)$, except indirectly via the mixture lemma. Thus we may have
asymptotic normality for $S_n(c)$, but this will derive from the more
direct limiting normality of the $S_\xi(c)$ and the fact that $n \sum_{i=1}^n c_i^2 \xi_i^2$
approaches a constant as $n$ grows large. The moments theorem,

    $E S_\xi^\nu(c) \le \bigl(n \sum_{i=1}^n c_i^2 \xi_i^2\bigr)^{\nu/2}\, E N^\nu(0,1)$,

for $\nu = 4, 6, 8, \ldots$, may not hold for $S_n(c)$.

Let us consider a hypothetical example: suppose we wish to
test $H_0$: $X_i \sim N(0,\sigma^2)$ vs $H_1$: $X_i \sim N(\beta i, \sigma^2)$, $\beta > 0$,
$i = 1, 2, \ldots, n$ (a "regression alternative"), so that the UMP($\alpha$)
test is to reject for large values of $S_n(c)$, $c = (1, 2, 3, \ldots, n)/\sqrt{n(n+1)(2n+1)/6}$.
We observe $S_n(c) = 1.75$, which is at about the .04 level of the
unconditional distribution. However, we compute $\sqrt{n \sum_i c_i^2 \xi_i^2} = 1.5$, so
the significance level in the conditional distribution is only
about .13. If we have a great deal of confidence in our normal
model we will probably believe the .04 significance. However,
the size of $\sqrt{n \sum_i c_i^2 \xi_i^2}$ already points to some abnormality in the data,
and we will be a good deal safer if we follow the conditional
inference.*
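The two significance levels in this example come from the same normal tail area evaluated on two different scales. A minimal sketch, with hypothetical variable names and the normal approximation taken for granted:

```python
import math

def upper_tail(z):
    """P(N(0,1) > z)."""
    return 0.5 * math.erfc(z / math.sqrt(2.0))

observed = 1.75          # observed value of S_n(c)
conditional_sd = 1.5     # computed value of sqrt(n * sum c_i^2 xi_i^2)

# unconditional reference distribution: N(0, 1)
unconditional_level = upper_tail(observed)
# conditional reference distribution: N(0, conditional_sd^2)
conditional_level = upper_tail(observed / conditional_sd)

if __name__ == "__main__":
    print(round(unconditional_level, 3), round(conditional_level, 3))
```

The unconditional level comes out at about .04 and the conditional level at roughly .12, in line with the figures quoted in the example; the whole difference between the two inferences lies in dividing the observed value by the conditional scale before entering the normal table.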

* Metaphysical statements of preference between the two modes of
inference abound in the literature, but no compelling criterion
of selection seems to exist at present.

There are, of course, ways we can retreat part way from the
full normal hypothesis, without going all the way to the test
based on orthant symmetry. We could, for instance, take $H_0$ to be
"the $X_i$ are i.i.d. symmetric about the origin", and test conditionally
given the order statistic of $\xi$, $\xi_{[1]} \le \xi_{[2]} \le \ldots \le \xi_{[n]}$. Under
$H_0$, the resulting statistic will be an equally weighted mixture
of $n!$ scaled generalized binomials, corresponding to taking all $n!$
permutations of the order statistic to give different $\xi$ vectors.
The scaling factors $\sqrt{n \sum_i c_i^2 \xi_{\pi(i)}^2}$ average to no more than unity,

    $\frac{1}{n!} \sum_\pi \sqrt{n \sum_{i=1}^n c_i^2 \xi_{\pi(i)}^2} \le 1$,

by the concavity of the square root function. Therefore, we
might feel that a $N(0,1)$ approximation to the statistic would
tend to be conservative. However, in this case it is not
necessarily true that the higher moments of the statistic will
be bounded by those of a $N(0,1)$ random variable.
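The claim about the scaling factors is easy to confirm numerically for small $n$: averaged over all permutations, the squared factors $n\sum c_i^2 \xi_{\pi(i)}^2$ have mean exactly 1 when $c$ and $\xi$ are unit vectors, so Jensen's inequality bounds the mean of their square roots by 1. The vectors and function names below are hypothetical.

```python
from itertools import permutations
import math

def unit(v):
    """Scale v to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def squared_factors(c, xi):
    """n * sum_i c_i^2 xi_{pi(i)}^2 for every permutation pi of xi."""
    n = len(c)
    return [n * sum(ci * ci * p * p for ci, p in zip(c, perm))
            for perm in permutations(xi)]

c = unit([1, 2, 3, 4, 5, 6])                   # regression-type direction
xi = unit([0.1, 0.2, 0.2, 0.5, 1.0, 3.0])      # hypothetical |U_i| with one large value

factors2 = squared_factors(c, xi)
mean_squared = sum(factors2) / len(factors2)                    # equals 1
mean_scaling = sum(math.sqrt(f) for f in factors2) / len(factors2)  # at most 1

if __name__ == "__main__":
    print(mean_squared, mean_scaling, math.sqrt(max(factors2)))
```

Individual scaling factors can still be well above 1, which is what drives the gap between the conditional and unconditional inferences in the example above.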

Other sampling characteristics of the angular distribution
of $X$ can be approximated from the central limit theorem. For
example, let $H_C$ be a fixed $k$-dimensional subspace of $E^n$, determined
by the orthonormal spanning vectors $c_1, c_2, \ldots, c_k$. The conditional
distribution of $n \cos^2\theta_{X,C}$ given $\xi$, where $\theta_{X,C}$ is the angle
between $X$ and $H_C$, is approximated by $\sum_{j=1}^k \ell_j(\xi)\, \chi^2(j)$, where the
$\chi^2(j)$ are independent $\chi_1^2$ random variables, and the $\ell_j(\xi)$ are the
eigenvalues of

    $n\, C' \Xi^2 C$,

$C = (c_1, c_2, \ldots, c_k)$, $\Xi$ = the diagonal matrix with $\xi_1, \xi_2, \ldots, \xi_n$
as diagonal elements. These considerations are relevant to the
permutation distribution of Hotelling's $T^2$, which the author will
consider in a companion paper.


10. Hotelling's Paper and Other References.

This work was stimulated by Hotelling's 1961 paper "The
Behavior of Some Standard Statistical Tests Under Non-standard
Conditions" [11]. After setting up the geometry, Hotelling
approximates the size of the t-test by $\alpha\, d\lambda_X^+(e)/d\lambda^+$ (in our notation). This
approximation requires $C_\alpha$, the rejection set, to be small enough
so that the measure $\lambda_X$ has close to constant density over it,
which leads Hotelling to consider very small $\alpha$ levels, $\alpha < 1/2^n$.
For this range of $\alpha$, he shows that the size of the t-test relative
to the nominal size may vary from 0 to $\infty$, even with i.i.d. symmetric
bounded components. Since we know that reasonable $n$ and $\alpha$ actually
yield very large sets $C_\alpha$ spreading over a good portion of $S_n$,
it is not surprising that Hotelling's results are quite different
from those developed here.

By now it is a matter of some hubris to claim originality
for any topic bearing on the t-test. Many of the topics presented
here have been discussed by other authors. Hoeffding's 1952 paper
[10] is particularly relevant. The case of the double exponential
with a translation parameter, discussed in Section 7, has been
investigated by Lehmann [13], and others. For an extensive review
and bibliography of Student's test under non-normal conditions
the reader is referred to [9].

The author is indebted to R. R. Bahadur and M. J. Eaton of
Chicago, and J. Hartigan of Princeton for enlightening conversation
and correspondence on the subject presented here.

Appendix of Mathematical Proofs
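Section 3: $E S_\xi^\nu < E S_e^\nu$ for all $\xi \in S_n^+$, $\xi \ne e$, for $\nu = 4, 6, 8, \ldots$.

Proof: Assume we have proven the result for the case $n-1$, and
write $S_\xi = c\, S_{\xi'} + \xi_n \Delta_n$, where

    $c = \sqrt{\sum_{i=1}^{n-1} \xi_i^2}$   and   $S_{\xi'} = \sum_{i=1}^{n-1} \Delta_i \xi_i / c$.

Using the symmetry about zero and independence of $S_{\xi'}$ and $\Delta_n$
(remember that these calculations are conditional on $\xi$), we get

    $E S_\xi^\nu = c^\nu E S_{\xi'}^\nu + \binom{\nu}{2} c^{\nu-2} \xi_n^2\, E S_{\xi'}^{\nu-2} + \ldots + \xi_n^\nu$.

By the induction hypothesis, this expression will be increased if
we change $\xi_i$ to $\sqrt{(1-\xi_n^2)/(n-1)}$ for $i = 1, 2, \ldots, n-1$, unless the first
$n-1$ $\xi_i$ were already equal. By applying the same argument to the
last $n-1$ $\xi_i$, we see that $\xi = e$ is the only possible maximum point
for $E S_\xi^\nu$ over the compact set $S_n^+$.

It remains to verify the result for the case $n = 2$. We have

    $E S_\xi^\nu = \frac{1}{2}\bigl[(\xi_1+\xi_2)^\nu + (\xi_1-\xi_2)^\nu\bigr] = \frac{1}{2}\bigl[(\sqrt{\gamma}+\sqrt{1-\gamma})^\nu + (\sqrt{\gamma}-\sqrt{1-\gamma})^\nu\bigr]$,

where $\gamma = \xi_1^2$. Thus

    $\frac{dE S_\xi^\nu}{d\gamma} = \frac{\nu}{4}\Bigl[(\sqrt{\gamma}+\sqrt{1-\gamma})^{\nu-1}\Bigl(\frac{1}{\sqrt{\gamma}} - \frac{1}{\sqrt{1-\gamma}}\Bigr) + (\sqrt{\gamma}-\sqrt{1-\gamma})^{\nu-1}\Bigl(\frac{1}{\sqrt{\gamma}} + \frac{1}{\sqrt{1-\gamma}}\Bigr)\Bigr]$,

which can be written in the factored form

    $\frac{\nu (1-2\gamma)}{4\sqrt{\gamma(1-\gamma)}}\Bigl[(\sqrt{\gamma}+\sqrt{1-\gamma})^{\nu-2} - (\sqrt{\gamma}-\sqrt{1-\gamma})^{\nu-2}\Bigr]$.

This is negative for $\gamma > \frac{1}{2}$ and positive for $\gamma < \frac{1}{2}$, showing that $E S_\xi^\nu$
attains its maximum for $\gamma = \frac{1}{2}$, or $\xi = e$.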

$E S_e^\nu < E N^\nu(0,1)$ for $\nu = 4, 6, 8, \ldots$.

Proof: $N(0,1) = \sum_{i=1}^n Z_i/\sqrt{n}$, $Z_i \sim N(0,1)$ independent, and $S_e = \sum_{i=1}^n \Delta_i/\sqrt{n}$ where
the $\Delta_i$ are independent, $\Delta_i = +1$ or $-1$ with probabilities $\frac{1}{2}$. The
result follows immediately from the fact that $E Z_i^j \ge E \Delta_i^j$ for $j$
even, with strict inequality for $j \ge 4$.
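The case $\nu = 4$ can be worked out in closed form, which gives a quick check on both inequalities. This is our own illustrative calculation, using only the independence of the signs and $\sum_i \xi_i^2 = 1$:

```latex
\begin{aligned}
E S_\xi^4 &= E\Bigl(\sum_{i=1}^n \Delta_i \xi_i\Bigr)^4
           = \sum_{i=1}^n \xi_i^4 + 3\sum_{i \ne j} \xi_i^2 \xi_j^2 \\
          &= 3\Bigl(\sum_{i=1}^n \xi_i^2\Bigr)^2 - 2\sum_{i=1}^n \xi_i^4
           = 3 - 2\sum_{i=1}^n \xi_i^4 .
\end{aligned}
```

Since $\sum_i \xi_i^4 \ge 1/n$ with equality only when every $\xi_i^2 = 1/n$, we get $E S_\xi^4 \le 3 - 2/n = E S_e^4 < 3 = E N^4(0,1)$, with equality in the first step only at $\xi = e$. For $n = 2$ and $\gamma = \xi_1^2$ this reads $E S_\xi^4 = 1 + 4\gamma(1-\gamma)$, which is maximized at $\gamma = \frac{1}{2}$.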


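Section 4: Edgeworth Expansion for $P(S_n < s)$.

If $S_n = \sum_{i=1}^n V_i/\sqrt{n}$, where the $V_i$ are independent random variables,
symmetric about zero, $\sum_{i=1}^n \sigma^2(V_i) = n$, then the Edgeworth expansion
for $P(S_n < s)$ (see [2], pp. 221-231) can be written as

    $P(S_n < s) = \Phi(s) + \frac{1}{n}\frac{\bar\kappa_4}{4!}\Phi^{(4)}(s)
        + \frac{1}{n^2}\Bigl[\frac{\bar\kappa_6}{6!}\Phi^{(6)}(s) + \frac{\bar\kappa_4^2}{2!\,(4!)^2}\Phi^{(8)}(s)\Bigr]
        + \frac{1}{n^3}\Bigl[\frac{\bar\kappa_8}{8!}\Phi^{(8)}(s) + \frac{\bar\kappa_4 \bar\kappa_6}{4!\,6!}\Phi^{(10)}(s) + \frac{\bar\kappa_4^3}{3!\,(4!)^3}\Phi^{(12)}(s)\Bigr] + \ldots$ .

Here $\bar\kappa_\nu$ is the average $\nu$th cumulant,

    $\bar\kappa_\nu = \frac{1}{n}\sum_{i=1}^n \kappa_\nu(V_i)$,

where we recall that the characteristic function of $V_i$ defines
$\kappa_\nu(V_i)$ by

    $\log \phi_{V_i}(t) = \sum_\nu \frac{\kappa_\nu(V_i)}{\nu!}(it)^\nu$.

The superscripts on $\Phi$ indicate repeated differentiation. The
terms are grouped in such a way that the indicated orders of
magnitude in $n$ hold for the case of the $V_i$ i.i.d.

In our case we let $V_{\xi,i} = \sqrt{n}\,\xi_i \Delta_i$, and note that the
characteristic function is $\phi_{V_{\xi,i}}(t) = \cos(\sqrt{n}\,\xi_i t)$. A standard
expansion of log cos now yields

    $\bar\kappa_{\xi,\nu} = c_\nu\, n^{\nu/2-1} \sum_{i=1}^n \xi_i^\nu$,   $\nu = 2, 4, 6, \ldots$,

where

    $c_\nu = (-1)^{\nu/2+1}\, \frac{2^\nu (2^\nu - 1)}{\nu}\, B_{\nu/2}
           = (-1)^{\nu/2+1}\, (\nu-1)!\; 2 \Bigl(\frac{2}{\pi}\Bigr)^\nu \Bigl[1 + \frac{1}{3^\nu} + \frac{1}{5^\nu} + \ldots\Bigr]$,

$B_{\nu/2}$ being the $\nu/2$-th Bernoulli number (ref [4], #603.3 and #47.3).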
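The constants $c_\nu$ are the cumulants of a single random sign, so the first few should be $1, -2, 16, -272$. The sketch below evaluates both forms of $c_\nu$; it assumes the older all-positive convention for the Bernoulli numbers ($B_1 = 1/6$, $B_2 = 1/30$, $B_3 = 1/42$, $B_4 = 1/30$), which is the convention under which $c_2 = 1$, and the function names are hypothetical.

```python
import math
from fractions import Fraction

# Bernoulli numbers in the older all-positive convention (an assumption of this sketch)
OLD_BERNOULLI = {1: Fraction(1, 6), 2: Fraction(1, 30),
                 3: Fraction(1, 42), 4: Fraction(1, 30)}

def c_bernoulli(nu):
    """c_nu = (-1)^(nu/2 + 1) * 2^nu (2^nu - 1) / nu * B_{nu/2}, exactly."""
    sign = (-1) ** (nu // 2 + 1)
    return sign * Fraction(2 ** nu * (2 ** nu - 1), nu) * OLD_BERNOULLI[nu // 2]

def c_series(nu, terms=20000):
    """c_nu = (-1)^(nu/2 + 1) (nu-1)! 2 (2/pi)^nu [1 + 3^-nu + 5^-nu + ...],
    with the series truncated after `terms` terms."""
    sign = (-1) ** (nu // 2 + 1)
    odd_sum = sum(1.0 / (2 * k + 1) ** nu for k in range(terms))
    return sign * math.factorial(nu - 1) * 2.0 * (2.0 / math.pi) ** nu * odd_sum

if __name__ == "__main__":
    for nu in (2, 4, 6, 8):
        print(nu, c_bernoulli(nu), round(c_series(nu), 4))
```

With $c_4 = -2$ the leading Edgeworth correction involves $\bar\kappa_{\xi,4} = -2n\sum_i \xi_i^4$, which is negative for every $\xi$.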

If we use the Edgeworth expansion with the values $\bar\kappa_{\xi,\nu}$ we
get an expansion for the generalized binomial probability $P(S_\xi < s)$.
From the mixture lemma, $P(S_n < s) = E_\xi(P(S_\xi < s))$, and we can
take this expectation term by term in the expansion. The leading
correction term to $\Phi(s)$, $\frac{1}{n}\frac{\bar\kappa_{\xi,4}}{4!}\Phi^{(4)}(s)$, has expectation

    $-\frac{1}{12}\Phi^{(4)}(s) \sum_{i=1}^n E\xi_i^4$.

If we assume exchangeable components then $\sum_{i=1}^n E\xi_i^4 = nE\xi_1^4 = nEU_1^4$.
Proceeding in this way yields the expansion of Section 4.


Section 5: Angular Distribution For Vector With Inverted
Normal Components.

We calculate the angular distribution of a random vector $\tilde X$
with components $\tilde X_i = 1/X_i$, where the $X_i$ are independent $N(0,1)$. Rather than work
with the coordinates $\xi_1, \xi_2, \ldots, \xi_n$ on $S_n^+$, which are redundant and
must be reduced to an $n-1$ component set, we calculate our densities
with respect to the coordinates $y = (y_2, y_3, \ldots, y_n)$,

    $y_i = \xi_i/\xi_1$,   $i = 2, 3, \ldots, n$,

taking values in $\mathcal{Y}^+ = \{y : y_i > 0,\ i = 2, 3, \ldots, n\}$. (As before,
it is sufficient to consider only the case of positive
observations because of orthant symmetry.)

It is easy to verify that the normally distributed vector
$X = (X_1, X_2, \ldots, X_n)$ yields a density

    $f_X(y) = c_n (1 + y_2^2 + \ldots + y_n^2)^{-n/2}$

for $y \in \mathcal{Y}^+$, where $c_n = 2^{n-1}\Gamma(\frac{n}{2})/\pi^{n/2}$ (the "multivariate Cauchy
distribution"). The transformation $\tilde X_i = 1/X_i$, $i = 1, 2, \ldots, n$, induces
the transformation $\tilde y_i = 1/y_i$, $i = 2, 3, \ldots, n$, in $\mathcal{Y}^+$. We see that
inverted normal components induce a density

    $f_{\tilde X}(y) = c_n \bigl(\prod_{i=2}^n y_i^2\bigr)^{-1} (1 + 1/y_2^2 + 1/y_3^2 + \ldots + 1/y_n^2)^{-n/2}$

on $\mathcal{Y}^+$. The Radon-Nikodym derivative of this density with respect
to the former is

    $\bigl(\prod_{i=2}^n y_i^2\bigr)^{-1} \Bigl(\frac{1 + y_2^2 + y_3^2 + \ldots + y_n^2}{1 + 1/y_2^2 + 1/y_3^2 + \ldots + 1/y_n^2}\Bigr)^{n/2}$.

In particular the derivative at $y = (1, 1, \ldots, 1)$, or equivalently
at $\xi = e$, is equal to 1 as claimed in Section 5. If we approach
the corner $\xi = (1, 0, 0, \ldots, 0)$ of $S_n^+$ by way of vectors $y = (\epsilon, \epsilon, \ldots, \epsilon)$,
$\epsilon$ approaching 0, the derivative goes to infinity as $1/\epsilon^{n-2}$ (for $n > 2$).
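The behaviour of the density ratio is easy to examine numerically. The sketch below evaluates the ratio $f_{\tilde X}(y)/f_X(y)$ obtained from the change of variables $y_i \to 1/y_i$ (with Jacobian $\prod y_i^{-2}$); under that form the ratio equals 1 at $y = (1, \ldots, 1)$, is identically 1 when $n = 2$ (the ratio of two inverted normals is again Cauchy), and grows like $\epsilon^{-(n-2)}(n-1)^{-n/2}$ along $y = (\epsilon, \ldots, \epsilon)$. The function name is hypothetical.

```python
def rn_derivative(y):
    """Ratio of the inverted-normal angular density to the normal one
    at y = (y_2, ..., y_n); the constant c_n cancels."""
    n = len(y) + 1
    prod_sq = 1.0
    for v in y:
        prod_sq *= v * v
    numerator = 1.0 + sum(v * v for v in y)
    denominator = 1.0 + sum(1.0 / (v * v) for v in y)
    return (numerator / denominator) ** (n / 2.0) / prod_sq

if __name__ == "__main__":
    print(rn_derivative([1.0, 1.0, 1.0]))        # at xi = e
    print(rn_derivative([0.37]))                 # n = 2: ratio is 1 everywhere
    for eps in (1e-1, 1e-2, 1e-3):               # n = 5: approach the corner
        print(eps, rn_derivative([eps] * 4) * eps ** 3)
```

The last column settles at $(n-1)^{-n/2} = 4^{-5/2}$ for $n = 5$, showing how much extra mass the inverted normal vector places near the corners of $S_n^+$.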


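Calculation of Kurtosis of $S_n$ for Inverted Normal Components.

We have

    $\mathrm{kurt}(S_n) = -2nE\tilde U_1^4 = -2nE\bigl[\tilde X_1^2/(\tilde X_1^2 + \tilde X_2^2 + \ldots + \tilde X_n^2)\bigr]^2$,

where $\tilde X_i = 1/X_i$, with the $X_i$ independent $N(0,1)$. Using the fact that $\tilde X_i^2$ is
positive stable law of order $\frac{1}{2}$, this equals

    $-2nE\bigl[1 + (n-1)^2 \tilde X_2^2/\tilde X_1^2\bigr]^{-2} = -2nE\bigl[1 + (n-1)^2 X_1^2/X_2^2\bigr]^{-2}
        = -\frac{4n}{\pi}\int_0^{\pi/2} \bigl[1 + (n-1)^2 \tan^2\theta\bigr]^{-2}\, d\theta$,

the last step following from the fact that $\theta = \tan^{-1}(X_1/X_2)$ is
uniformly distributed between 0 and $\pi$, together with the symmetry
of $\tan^2\theta$ about $\pi/2$. The substitution
$V = (n-1)\tan\theta$ gives

    $\mathrm{kurt}(S_n) = -\frac{4n}{\pi(n-1)}\int_0^\infty (1+V^2)^{-2}\Bigl(1 + \bigl(\tfrac{V}{n-1}\bigr)^2\Bigr)^{-1}\, dV$.

The approximation $\bigl(1 + (\tfrac{V}{n-1})^2\bigr)^{-1} \approx 1 - (\tfrac{V}{n-1})^2$ then gives

    $\mathrm{kurt}(S_n) = -1 - \frac{1}{n} + o(n^{-1})$.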

DOCUMENT CONTROL DATA - R&D

1.  Originating activity (corporate author): Department of Statistics,
    Harvard University, Cambridge, Massachusetts
3.  Report title: Student's t-test Under Non-normal Conditions
4.  Descriptive notes: Technical Report; May 22, 1968
5.  Author: Efron, Bradley
6.  Report date: May 22, 1968
7a. Total no. of pages: 36
7b. No. of refs: 14
8a. Contract or grant no.: Nonr 1866(37)
 c. MR-042-097
9a. Originator's report number: Technical Report No. 21
10. Availability/limitation notices: Distribution of this document is
    unlimited
12. Sponsoring military activity: Logistics & Mathematical Statistics
    Branch, Office of Naval Research, Department of the Navy,
    Washington, D.C.
